Ultracomputers [Schwartz, 1979] are assemblages of processors that are able
to  operate  concurrently  and  can  exchange  data  through  communication  lines  in, 
say,  one  cycle  of  operation.  For  physical  reasons,  the  fan  in/out  of  the  processors 
must  be  limited.  This  imposes  restrictions  on  the  possible  communication 
schemes.  In  order  to  have  the  ultracomputer  operate  efficiently  as  a  whole,  it  is 
desirable  that  arbitrary  exchanges  of  information  between  the  processors  can  be 
effected  in  a  small  number  of  data  shifts. 

If a really huge ultracomputer is built, it would be nice if it could be
constructed by coupling smaller ultracomputers, which in turn are assembled from
still smaller ultracomputers, and so on. It will be shown that the latter desire
conflicts to a certain extent with the earlier one.

For the purposes of this note, a paracomputer is a sequence of directed
graphs. (Ultracomputers are paracomputers satisfying a restriction defined
below.) Throughout the paper, the sequence $G_D$, $D = 0, 1, \ldots$ stands for a
paracomputer. Each $G_D$ is a pair $\langle P_D, L_D \rangle$, where $P_D$ is the
set of nodes (or "processors") of $G_D$, and $L_D$ is a set of edges (or "lines")
$\langle p_1, p_2 \rangle \in P_D \times P_D$. We define

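\[ N_D = \#P_D \quad \text{(the size of $G_D$)}, \]

\[ \phi_D = \max_{p \in P_D} \#\{\langle p_1, p_2 \rangle \in L_D \mid p_1 = p \text{ or } p_2 = p\} \quad \text{(the maximal fan in/out in $G_D$)}, \]

\[ \Gamma_D = C_D / N_D, \quad \text{where } C_D = \#L_D. \]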

To exclude uninteresting cases, it is assumed that $N_D \to \infty$. (Here and in
the sequel, where limits or orders of magnitude are concerned, these are always
understood to be with respect to $D \to \infty$.)
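
As a concrete illustration of these quantities, the following Python sketch computes $N_D$, $\phi_D$ and $\Gamma_D$ for a single graph given as a set of nodes and a set of ordered pairs. The function names and the four-processor ring used as input are illustrative assumptions and do not come from the note.

```python
# Illustrative sketch: function names and the example graph are assumptions.

def size(nodes):
    """N_D: the number of processors."""
    return len(nodes)


def max_fan(nodes, lines):
    """phi_D: the largest number of lines having one given processor as an endpoint."""
    return max(
        sum(1 for (p1, p2) in lines if p1 == p or p2 == p)
        for p in nodes
    )


def lines_per_processor(nodes, lines):
    """Gamma_D = C_D / N_D, where C_D is the number of lines."""
    return len(lines) / len(nodes)


# Hypothetical example: a ring of four processors with lines running both ways.
ring_nodes = {0, 1, 2, 3}
ring_lines = ({(i, (i + 1) % 4) for i in range(4)}
              | {((i + 1) % 4, i) for i in range(4)})

print(size(ring_nodes),
      max_fan(ring_nodes, ring_lines),
      lines_per_processor(ring_nodes, ring_lines))
```

For this ring every processor has two outgoing and two incoming lines, so the fan in/out is 4 and there are 2 lines per processor.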

For a paracomputer to be an ultracomputer, the following requirement is
imposed:

(UC) $\phi_D$ is bounded by some constant $\phi$.

* Mathematisch Centrum, Amsterdam


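Lemma 1. (UC) implies that $\Gamma_D$ is bounded.

Proof:
\[ C_D = \#L_D = \#\{\langle p_1, p_2 \rangle \in L_D\} \le \tfrac{1}{2} \sum_{p \in P_D} \#\{\langle p_1, p_2 \rangle \in L_D \mid p_1 = p \text{ or } p_2 = p\} \le \tfrac{1}{2} \sum_{p \in P_D} \phi_D = \tfrac{1}{2} N_D \phi_D, \]
so $\Gamma_D = C_D N_D^{-1} \le \tfrac{1}{2} \phi_D$, which by (UC) is bounded.
(Each line joining two distinct processors is counted twice in the sum;
self-loops, which are of no use in a data shift, are disregarded.)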

The order of magnitude of the number of data shifts required to obtain an
arbitrary permutation on $P_D$ will determine how "fast" the paracomputer is. In
order to express this in terms of the graph model, we must go through some
definitions. The set of basic permutations on $G_D$ is defined by

\[ BP_D = \{\pi : \pi \text{ is a permutation on } P_D \mid \pi(p) = p \text{ or } \langle p, \pi(p) \rangle \in L_D \text{ for all } p \in P_D\}. \]

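The permutations $\mathrm{PERM}_D^{(d)}$ of shift depth $d$, $d \ge 0$, are
inductively defined by:

\[ \mathrm{PERM}_D^{(0)} = \{\pi_I\}, \text{ where $\pi_I$ stands for the identity permutation,} \]

\[ \mathrm{PERM}_D^{(d+1)} = \{\beta\pi \mid \beta \in BP_D,\ \pi \in \mathrm{PERM}_D^{(d)}\} - \bigcup_{k=0}^{d} \mathrm{PERM}_D^{(k)}. \]

(Note that $BP_D = \mathrm{PERM}_D^{(0)} \cup \mathrm{PERM}_D^{(1)}$.)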

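The shift depth $\mathrm{sd}_D(\pi)$ of a permutation $\pi$ on $P_D$ is defined by

\[ \mathrm{sd}_D(\pi) = d \iff \pi \in \mathrm{PERM}_D^{(d)}. \]

This definition may leave $\mathrm{sd}_D(\pi)$ undefined for a given $\pi$, in
which case we put $\mathrm{sd}_D(\pi) = \infty$.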

The maximal shift depth of $G_D$ is now

\[ M_D = \max_{\pi} \mathrm{sd}_D(\pi), \]

where $\pi$ ranges over all permutations on $P_D$. (The treatment of $\infty$'s
should be obvious.)
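
The inductive definition of shift depth amounts to a breadth-first search from the identity, in which each step composes with one basic permutation. The sketch below follows that reading for very small graphs. It assumes that the processors are numbered 0 to n-1 and that a permutation is stored as a tuple of images; the function names and the two example graphs are hypothetical.

```python
from itertools import permutations
from math import factorial, inf

# Illustrative sketch: processors are assumed to be numbered 0..n-1, and a
# permutation pi is stored as a tuple with pi[p] the image of p.


def basic_permutations(n, lines):
    """All permutations that move each processor along a line or leave it in place."""
    return [pi for pi in permutations(range(n))
            if all(pi[p] == p or (p, pi[p]) in lines for p in range(n))]


def shift_depths(n, lines):
    """Map every permutation of finite shift depth to that depth."""
    basics = basic_permutations(n, lines)
    identity = tuple(range(n))
    depth = {identity: 0}
    frontier = [identity]
    d = 0
    while frontier:
        d += 1
        following = []
        for pi in frontier:
            for beta in basics:
                composed = tuple(beta[pi[p]] for p in range(n))  # apply pi, then beta
                if composed not in depth:
                    depth[composed] = d
                    following.append(composed)
        frontier = following
    return depth


def max_shift_depth(n, lines):
    """M_D; infinite when some permutation is never reached."""
    depth = shift_depths(n, lines)
    if len(depth) < factorial(n):
        return inf
    return max(depth.values())


# Hypothetical examples on three processors.
row_lines = {(0, 1), (1, 0), (1, 2), (2, 1)}   # a row, lines both ways
one_way_ring = {(0, 1), (1, 2), (2, 0)}        # a ring, lines one way only

print(max_shift_depth(3, row_lines), max_shift_depth(3, one_way_ring))
```

The search enumerates all permutations, so it is only practical for a handful of processors. On the one-way ring only rotations can be produced, which is a case where the shift depth of most permutations is infinite.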

A paracomputer is called f(N)-fast if $M_D = O(f(N_D))$. For example, the
ultracomputer as defined in Schwartz [1979] has $N_D = 2^D$ and $M_D \le 4D - 3$
for $D \ge 1$, so it is log N-fast. In fact, it is easily seen to be strictly
log N-fast, meaning that it is log N-fast but not f(N)-fast for any
$f(N) = o(\log N)$. This is the best possible since no ultracomputer can improve
on log N-fastness. Note that the lower orders of f(N) correspond to faster
operation.

Lemma 2. Let the processors $P_D$ of $G_D$ be partitioned into two sets $S$ and
$T$. Let $n = \min(\#S, \#T)$ and $c = \#(L_D \cap S \times T)$. Then
$n \le M_D \cdot c$.

Proof: Let the permutations on $P_D$ be extended in the natural way to map
subsets of $P_D$ on subsets. Define

\[ a(\pi) = \#(\pi(S) \cap T). \]

We will first show that for $\beta \in BP_D$, $a(\beta) \le c$. For

\[ a(\beta) = \#(\beta(S) \cap T) = \#\{s \in S \mid \beta(s) \in T\} = \#\{s \in S \mid \langle s, \beta(s) \rangle \in L_D \cap S \times T\} \le \#(L_D \cap S \times T) = c. \]

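Let $\pi$ be a permutation such that $\mathrm{sd}_D(\pi) = d$. It is claimed that
$a(\pi) \le d \cdot c$. The claim is easily shown correct by induction on $d$
(and, in fact, we have just shown it for the case $d = 1$). For
$\mathrm{sd}_D(\pi) = 0$, $\pi = \pi_I$, so

\[ a(\pi) = \#(\pi_I(S) \cap T) = \#(S \cap T) = 0. \]

For $\mathrm{sd}_D(\pi) > 0$, $\pi$ can be written as $\beta\pi'$, where
$\mathrm{sd}_D(\pi') = \mathrm{sd}_D(\pi) - 1$ and $\beta \in BP_D$. Since

\[ \pi'(S) = (\pi'(S) \cap S) \cup (\pi'(S) \cap T) \subseteq S \cup (\pi'(S) \cap T), \]

\[ \pi(S) = \beta\pi'(S) = \beta(\pi'(S)) \subseteq \beta(S \cup (\pi'(S) \cap T)) \subseteq \beta(S) \cup \beta(\pi'(S) \cap T), \]

so

\[ \beta\pi'(S) \cap T \subseteq (\beta(S) \cap T) \cup (\beta(\pi'(S) \cap T) \cap T) \subseteq (\beta(S) \cap T) \cup \beta(\pi'(S) \cap T). \]

We have

\[ a(\pi) = a(\beta\pi') = \#(\beta\pi'(S) \cap T) \le \#\bigl((\beta(S) \cap T) \cup \beta(\pi'(S) \cap T)\bigr) \]

\[ \le \#(\beta(S) \cap T) + \#\beta(\pi'(S) \cap T) = \#(\beta(S) \cap T) + \#(\pi'(S) \cap T) \]

\[ = a(\beta) + a(\pi'). \]

Using $a(\beta) \le c$, $\mathrm{sd}_D(\pi') = \mathrm{sd}_D(\pi) - 1$ and the
inductive hypothesis, it follows that

\[ a(\pi) \le c + (\mathrm{sd}_D(\pi) - 1)\, c = \mathrm{sd}_D(\pi) \cdot c. \]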

Next, choose (arbitrarily) two subsets $S' \subseteq S$ and $T' \subseteq T$,
each of size $n$. Let $\pi$ be any permutation such that $\pi(S') = T'$. Then

\[ n = \#T' = \#(\pi(S') \cap T') \le \#(\pi(S) \cap T) = a(\pi), \]

so, since $M_D$ is an upper bound of the values of $\mathrm{sd}_D(\pi)$,

\[ n \le a(\pi) \le \mathrm{sd}_D(\pi) \cdot c \le M_D \cdot c, \]

which proves the lemma.
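
The inequality of Lemma 2 can be checked exhaustively on a tiny graph. The sketch below recomputes the maximal shift depth by breadth-first search and then tests $n \le M_D \cdot c$ for every partition into two non-empty sets. It is a sanity check under stated assumptions, not a proof: processors are assumed to be numbered 0 to n-1, the function names and the example graph are hypothetical, and graphs with infinite maximal shift depth are deliberately rejected.

```python
from itertools import combinations, permutations
from math import factorial, inf

# Illustrative sketch: processors are assumed to be numbered 0..n-1.


def max_shift_depth(n, lines):
    """M_D, found by breadth-first search from the identity permutation."""
    basics = [b for b in permutations(range(n))
              if all(b[p] == p or (p, b[p]) in lines for p in range(n))]
    identity = tuple(range(n))
    depth = {identity: 0}
    frontier = [identity]
    while frontier:
        following = []
        for pi in frontier:
            for b in basics:
                composed = tuple(b[pi[p]] for p in range(n))  # apply pi, then b
                if composed not in depth:
                    depth[composed] = depth[pi] + 1
                    following.append(composed)
        frontier = following
    return max(depth.values()) if len(depth) == factorial(n) else inf


def lemma2_holds(n, lines):
    """Check min(#S, #T) <= M_D * c for every partition into non-empty S and T."""
    m = max_shift_depth(n, lines)
    if m == inf:
        raise ValueError("this sketch only covers finite maximal shift depth")
    for k in range(1, n):
        for chosen in combinations(range(n), k):
            s = set(chosen)
            t = set(range(n)) - s
            # c counts the lines leading from S into T.
            c = sum(1 for (p1, p2) in lines if p1 in s and p2 in t)
            if min(len(s), len(t)) > m * c:
                return False
    return True


# Hypothetical example: four processors in a row, lines running both ways.
row_lines = {(i, i + 1) for i in range(3)} | {(i + 1, i) for i in range(3)}
print(lemma2_holds(4, row_lines))
```

Splitting the row in the middle leaves a single line from S to T, so this partition alone already forces the maximal shift depth to be at least 2.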

Remark. Although it may not be obvious from the formalism of the proof, the
crucial idea is that at any shift $\beta$ at most $c$ items from $S'$ may reach
(their destination in) $T$ across the "boundary" between $S$ and $T$. It follows
that the lemma will also hold if the processors are not forced to give up their
current contents in passing it on to another processor and receiving data from a
third. Even an unlimited memory capacity of the processors will not help; the
bottle-neck is not the capacity of the processors but that of the lines.

A recurrent paracomputer is a paracomputer obeying a recurrence relation

\[ G_D = \langle P_{D-i_1} \cup \cdots \cup P_{D-i_n},\ L'_D \cup L_{D-i_1} \cup \cdots \cup L_{D-i_n} \rangle, \]

where $L'_D$ is the set of lines added to couple the constituents. In this scheme
the processors $P_{D-i_k}$ of constituent paracomputers $G_{D-i_k}$ are
considered distinct for different values of $k$, even if $i_k$ is the same (by
taking copies if necessary), so the unions involved are disjoint unions. We
require, moreover,

\[ n \ge 2 \quad \text{and} \quad 1 = i_1 \le i_2 \le \cdots \le i_n. \]

(An additional requirement, which we do not need however, might be that
$L'_D \subseteq P_D \times P_D$ is disjoint from each
$P_{D-i_k} \times P_{D-i_k}$.) We shall write $I$ for $i_n$.

To get the sequence started, we take $G_D = \langle \emptyset, \emptyset \rangle$
for $D < 0$ and $G_0 = \langle \{\Lambda\}, \emptyset \rangle$. ($\Lambda$ stands
for any "atom" to label the processor in the point set $P_0$, e.g., the null
sequence. For the following considerations the choice of $P_0$ is immaterial, as
long as $N_0 > 0$. Moreover, if $N_0 = 1$, the choice of $L_0$ is immaterial.)

For a recurrent paracomputer we have

\[ N_D = 0 \text{ for } D < 0; \]

\[ N_0 = 1; \]

\[ N_D = \sum_{k=1}^{n} N_{D-i_k} \text{ for } D > 0. \]

Obviously, $N_D$ is monotone increasing for $D \ge 0$, and strictly so for
$D \ge I$. The solution to a recurrence relation of this type can be written
explicitly as

\[ N_D = \sum_{j=1}^{I} a_j \lambda_j^D, \]

where the $\lambda_j$ are the roots of the equation
$\sum_{k=1}^{n} \lambda^{-i_k} = 1$. If $\lambda$ is the largest of these roots,
we have

\[ N_D = a\lambda^D + O(\mu^D) \qquad (1) \]

for some positive $a$ and some $\mu$ such that $|\mu| < \lambda$. (If there is a
multiple root, the general explicit solution is slightly more complicated. We are
concerned with the behavior of $N_D$, however, and it can be shown that the
largest root is larger than 1 and exceeds the other roots in absolute magnitude,
and so has multiplicity 1.)
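
The growth rate in (1) is easy to observe numerically. The sketch below tabulates $N_D$ from the recurrence and locates the largest root of $\sum_k \lambda^{-i_k} = 1$ by bisection, using the fact that the left-hand side is decreasing for positive $\lambda$. The function names, the tolerance and the choice of offsets $(i_1, i_2) = (1, 2)$ are assumptions made for illustration.

```python
# Illustrative sketch: function names, tolerance and example offsets are assumptions.


def sizes(offsets, d_max):
    """N_0 .. N_{d_max} from N_D = sum_k N_{D - i_k}, N_0 = 1, N_D = 0 for D < 0."""
    n = [1]
    for d in range(1, d_max + 1):
        n.append(sum(n[d - i] for i in offsets if d - i >= 0))
    return n


def largest_root(offsets, tol=1e-12):
    """Bisection for the root greater than 1 of sum_k x**(-i_k) = 1."""
    def excess(x):
        return sum(x ** (-i) for i in offsets) - 1.0

    # excess(1) = n - 1 > 0 and excess(n + 1) < 0, so the root lies in between.
    lo, hi = 1.0, float(len(offsets)) + 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


offsets = (1, 2)                 # hypothetical choice of i_1, i_2
n = sizes(offsets, 40)
lam = largest_root(offsets)
print(n[:8], lam, n[40] / n[39])
```

With these offsets the sizes are the Fibonacci numbers and the largest root is the golden ratio, so the ratio of successive sizes approaches that value.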


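Putting $C_D = \#L_D$ and $c_D = \#L'_D$, we also have

\[ C_D = 0 \text{ for } D < 0, \]

\[ C_D = c_D + \sum_{k=1}^{n} C_{D-i_k} \text{ for } D \ge 0. \]

This recurrence relation is solved by

\[ C_D = \sum_{q=1}^{D} N_{D-q}\, c_q. \qquad (2) \]

(If $L_0 \ne \emptyset$, the summation should start with $q = 0$.)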
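
Formula (2) can be compared against the recurrence for $C_D$ on arbitrary input. In the sketch below the list `new_lines` holds hypothetical values of $c_0, c_1, \ldots$ (the number of lines added at each level); the function names are likewise assumptions. The sum is started at $q = 0$, which changes nothing when $c_0 = 0$.

```python
# Illustrative sketch: function names and the example data are assumptions.


def sizes(offsets, d_max):
    """N_0 .. N_{d_max} from N_D = sum_k N_{D - i_k}, N_0 = 1, N_D = 0 for D < 0."""
    n = [1]
    for d in range(1, d_max + 1):
        n.append(sum(n[d - i] for i in offsets if d - i >= 0))
    return n


def total_lines_by_recurrence(offsets, new_lines):
    """C_D = c_D + sum_k C_{D - i_k}, with C_D = 0 for D < 0."""
    totals = []
    for d, c_d in enumerate(new_lines):
        totals.append(c_d + sum(totals[d - i] for i in offsets if d - i >= 0))
    return totals


def total_lines_by_formula(offsets, new_lines):
    """C_D = sum_q N_{D-q} c_q; the q = 0 term vanishes when c_0 = 0."""
    n = sizes(offsets, len(new_lines) - 1)
    return [sum(n[d - q] * new_lines[q] for q in range(d + 1))
            for d in range(len(new_lines))]


offsets = (1, 2)                     # hypothetical i_1, i_2
new_lines = [0, 3, 1, 4, 1, 5, 9]    # hypothetical c_0 .. c_6
print(total_lines_by_recurrence(offsets, new_lines))
print(total_lines_by_formula(offsets, new_lines))
```

Both functions return the same list for any choice of offsets and line counts, which is what (2) asserts.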

To give an example of a recurrent paracomputer, consider

\[ G_D = \langle P_{D-1}^{(0)} \cup P_{D-1}^{(1)},\ L'_D \cup L_{D-1}^{(0)} \cup L_{D-1}^{(1)} \rangle. \]

The superscripts (0) and (1) serve to distinguish the two copies of $G_{D-1}$. If
$p$ is a processor of $P_{D-1}$, the corresponding processors of
$P_{D-1}^{(0)}$ and $P_{D-1}^{(1)}$ are written $p0$ and $p1$, respectively.
$L'_D$ is then defined as

\[ \{\langle p0, p1 \rangle \mid p \in P_{D-1}\} \cup \{\langle p1, p0 \rangle \mid p \in P_{D-1}\}. \]

So $N_D = 2^D$. Since $\phi_D = 2D$, this recurrent paracomputer is not an
ultracomputer. It is easily shown to be strictly log N-fast. $G_D$ is isomorphic
to a hypercube (with edges running both ways) of dimension $D$.
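
The construction in this example can be carried out mechanically. The sketch below builds the graph level by level, labelling processors by bit strings so that the two copies of a processor p become p0 and p1, and then reports the size, the number of lines and the maximal fan in/out. The function names and the string labelling are illustrative assumptions.

```python
# Illustrative sketch: function names and the bit-string labels are assumptions.


def hypercube(d):
    """Build the example graph of level d, starting from one processor labelled ''."""
    nodes, lines = {""}, set()
    for _ in range(d):
        # Two copies of the previous level: labels extended by 0 and by 1.
        copied = {(p1 + b, p2 + b) for (p1, p2) in lines for b in "01"}
        # New lines couple corresponding processors in both directions.
        fresh = ({(p + "0", p + "1") for p in nodes}
                 | {(p + "1", p + "0") for p in nodes})
        nodes = {p + b for p in nodes for b in "01"}
        lines = copied | fresh
    return nodes, lines


def max_fan(nodes, lines):
    """phi_D: the largest number of lines having one given processor as an endpoint."""
    return max(sum(1 for (p1, p2) in lines if p1 == p or p2 == p) for p in nodes)


for d in range(5):
    nodes, lines = hypercube(d)
    print(d, len(nodes), len(lines), max_fan(nodes, lines))
```

The fan in/out grows linearly with the level, which is why this family violates (UC) even though each level is built from two copies of the previous one.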

Theorem.  Recurrent  ultracomputers  are  not  log  N-fast. 

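Proof: By contradiction. Let the sequence $G_D$ be a log N-fast recurrent
ultracomputer. Since $\log N_D = O(D)$ by (1), we have $M_D = O(D)$, so at most a
finite number of the values of $M_D$ is infinite. If this should be the case, we
augment the corresponding $L_D$ to make $M_D$ finite. This does not influence
property (UC). Now, for some $\alpha > 0$, $M_D < \alpha D$ for all $D \ge 1$.

We can partition $P_D$ into two sets, $S = P_{D-i_1}$ and
$T = P_{D-i_2} \cup \cdots \cup P_{D-i_n}$. From $I = \max i_k$,
$k = 1, \ldots, n$, we have $\min(\#S, \#T) \ge N_{D-I}$. Each $L_{D-i_k}$
contains members of $P_{D-i_k} \times P_{D-i_k}$ only, so members of
$S \times T$ contained in
$L_D = L'_D \cup L_{D-i_1} \cup \cdots \cup L_{D-i_n}$ are members of $L'_D$.
Consequently, $\#(L_D \cap S \times T) \le \#L'_D = c_D$. Application of Lemma 2
yields now

\[ N_{D-I} \le M_D\, c_D. \]

Using $M_D < \alpha D$ and (2), we obtain for $\Gamma_D$

\[ \Gamma_D = \frac{C_D}{N_D} = \frac{1}{N_D} \sum_{q=1}^{D} N_{D-q}\, c_q \ge \frac{1}{\alpha} \sum_{q=1}^{D} \frac{1}{q}\, \frac{N_{D-q} N_{q-I}}{N_D}. \]

Since $N_{D-q} N_{q-I} / N_D \approx a \lambda^{-I}$ when both $q$ and $D - q$ are
large, with $a$ and $\lambda$ as in (1), we are led to rewrite this as

\[ \Gamma_D \ge \frac{a}{\alpha \lambda^{I}} \sum_{q=1}^{D} \frac{1}{q} + \frac{1}{\alpha} \sum_{q=1}^{D} \frac{1}{q} \left( \frac{N_{D-q} N_{q-I}}{N_D} - \frac{a}{\lambda^{I}} \right). \]

From (1) it is clear that the sum in the second term has a finite limit, whereas
the first term is clearly unbounded, so $\Gamma_D$ is unbounded. Together with
Lemma 1 this yields a contradiction.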
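
The unbounded growth that drives the contradiction can be observed numerically. Combining $N_{D-I} \le M_D c_D$ with a linear bound on $M_D$ and with (2) bounds $\Gamma_D$ from below by a constant multiple of $\sum_{q} \frac{1}{q} N_{D-q} N_{q-I} / N_D$. The sketch below evaluates that sum for one hypothetical choice of offsets, $(i_1, i_2) = (1, 2)$; the function names are assumptions, and the code illustrates the argument rather than replacing it.

```python
# Illustrative sketch: function names and the example offsets are assumptions.


def sizes(offsets, d_max):
    """N_0 .. N_{d_max} from N_D = sum_k N_{D - i_k}, N_0 = 1, N_D = 0 for D < 0."""
    n = [1]
    for d in range(1, d_max + 1):
        n.append(sum(n[d - i] for i in offsets if d - i >= 0))
    return n


def lower_bound_sum(offsets, d):
    """sum over q of (1/q) * N_{D-q} * N_{q-I} / N_D, with I the largest offset."""
    big_i = max(offsets)
    n = sizes(offsets, d)
    # Terms with q < I vanish because N_{q-I} = 0 there.
    return sum(n[d - q] * n[q - big_i] / (q * n[d]) for q in range(big_i, d + 1))


for d in (50, 100, 200, 400):
    print(d, lower_bound_sum((1, 2), d))
```

Each doubling of D adds roughly the same amount to the sum, which is the logarithmic growth of the harmonic series in the first term of the proof.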

Remark. The possibility is still left open that recurrent ultracomputers might
exist that are $(\log N)^{1+\epsilon}$-fast for arbitrarily small
$\epsilon > 0$. Note in fact that $\sum_q q^{-(1+\epsilon)}$ is bounded. A mere
existence proof, e.g., by enumerating combinations, would not be very helpful;
for an ultracomputer to be manageable the lines should definitely exhibit some
simple pattern. Note, moreover, that the criterion of boundedness of $\Gamma_D$
as applied is relatively weak; for example, if $c_D$ is constant, the reasoning
in the proof of the theorem fails completely to reveal that the corresponding
ultracomputer is at best N-fast, for no contradiction is obtained concerning the
boundedness of $\Gamma_D$ for even $(\log N)^{1+\epsilon}$-fastness (although the
contradiction follows immediately from the intermediate
$N_{D-I} \le M_D\, c_D$). It seems, therefore, entirely plausible that the result
of this note could be drastically sharpened.